Instance Label Prediction by Dirichlet Process Multiple Instance Learning

Authors

  • Melih Kandemir
  • Fred A. Hamprecht
Abstract

We propose a generative Bayesian model that predicts instance labels from weak (bag-level) supervision. We solve this problem by simultaneously modeling class distributions by Gaussian mixture models and inferring the class labels of positive bag instances that satisfy the multiple instance constraints. We employ Dirichlet process priors on mixture weights to automate model selection, and efficiently infer model parameters and positive bag instances by a constrained variational Bayes procedure. Our method improves on the state-of-the-art of instance classification from weak supervision on 20 benchmark text categorization data sets and one histopathology cancer diagnosis data set.
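
Two ingredients named in the abstract can be illustrated in isolation. The sketch below is not the authors' model or inference procedure. It only shows (a) a truncated stick-breaking draw, one common way to represent Dirichlet process mixture weights, and (b) the standard multiple instance rule that a bag is positive if and only if at least one of its instances is positive. The function names, the truncation level, and the toy labels are assumptions made for illustration.

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Draw truncated stick-breaking mixture weights (illustrative, assumed setup)."""
    # v_k ~ Beta(1, alpha); the last stick takes all remaining mass,
    # so the truncated weights sum to exactly 1.
    v = rng.beta(1.0, alpha, size=truncation)
    v[-1] = 1.0
    # Mass left over before each break: 1, (1 - v_1), (1 - v_1)(1 - v_2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

def bag_labels_from_instances(instance_labels, bag_ids):
    """Standard MIL rule: a bag is positive iff any of its instances is positive."""
    instance_labels = np.asarray(instance_labels)
    bag_ids = np.asarray(bag_ids)
    return {int(b): int(instance_labels[bag_ids == b].max())
            for b in np.unique(bag_ids)}

rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=1.0, truncation=10, rng=rng)

# Toy data: five instances spread over three bags (hypothetical labels).
bags = bag_labels_from_instances([0, 0, 1, 0, 0], [0, 0, 1, 1, 2])
```

In the weakly supervised setting described above, only the bag labels are observed; the instance labels inside positive bags are the unknowns that must be inferred subject to this rule.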

Similar resources

Exploiting Associations between Class Labels in Multi-label Classification

Multi-label classification has many applications in the text categorization, biology and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...

Large Margin Metric Learning for Multi-Label Prediction

Canonical correlation analysis (CCA) and maximum margin output coding (MMOC) methods have shown promising results for multi-label prediction, where each instance is associated with multiple labels. However, these methods require an expensive decoding procedure to recover the multiple labels of each testing instance. The testing complexity becomes unacceptable when there are many labels. To avoi...

Multiple-Instance Learning of Real-Valued Data

The multiple-instance learning model has received much attention recently with a primary application area being that of drug activity prediction. Most prior work on multiple-instance learning has been for concept learning, yet for drug activity prediction, the label is a real-valued affinity measurement giving the binding strength. We present extensions of k-nearest neighbors (k-NN), Citation-k...

Active Learning with Multi-Label SVM Classification

Multi-label classification, where each instance is assigned to multiple categories, is a prevalent problem in data analysis. However, annotations of multi-label instances are typically more timeconsuming or expensive to obtain than annotations of single-label instances. Though active learning has been widely studied on reducing labeling effort for single-label problems, current research on mult...

Dirichlet-Bernoulli Alignment: A Generative Model for Multi-Class Multi-Label Multi-Instance Corpora

We propose Dirichlet-Bernoulli Alignment (DBA), a generative model for corpora in which each pattern (e.g., a document) contains a set of instances (e.g., paragraphs in the document) and belongs to multiple classes. By casting predefined classes as latent Dirichlet variables (i.e., instance level labels), and modeling the multi-label of each pattern as Bernoulli variables conditioned on the wei...

Publication date: 2014